{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "mDV6gkLfRQFE"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/jeffheaton/app_deep_learning/blob/main/t81_558_class_10_3_transformer_timeseries.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "TkjMQljGRQFG"
      },
      "source": [
        "# T81-558: Applications of Deep Neural Networks\n",
        "**Module 10: Time Series in PyTorch**  \n",
        "\n",
        "* Instructor: [Jeff Heaton](https://sites.wustl.edu/jeffheaton/), McKelvey School of Engineering, [Washington University in St. Louis](https://engineering.wustl.edu/Programs/Pages/default.aspx)\n",
        "* For more information visit the [class website](https://sites.wustl.edu/jeffheaton/t81-558/)."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "fjiBljzcRQFG"
      },
      "source": [
        "# Module 10 Material\n",
        "\n",
        "* Part 10.1: Time Series Data Encoding for Deep Learning, PyTorch [[Video]](https://www.youtube.com/watch?v=CZi5Avp6p1s&list=PLjy4p-07OYzulelvJ5KVaT2pDlxivl_BN) [[Notebook]](t81_558_class_10_1_timeseries.ipynb)\n",
        "* Part 10.2: LSTM-Based Time Series with PyTorch [[Video]](https://www.youtube.com/watch?v=hIQLy5zCgH4&list=PLjy4p-07OYzulelvJ5KVaT2pDlxivl_BN) [[Notebook]](t81_558_class_10_2_lstm.ipynb)\n",
        "* **Part 10.3: Transformer-Based Time Series with PyTorch** [[Video]](https://www.youtube.com/watch?v=NGzQpphf_Vc&list=PLjy4p-07OYzulelvJ5KVaT2pDlxivl_BN) [[Notebook]](t81_558_class_10_3_transformer_timeseries.ipynb)\n",
        "* Part 10.4: Seasonality and Trend [[Video]](https://www.youtube.com/watch?v=HOkxoLaUF9s&list=PLjy4p-07OYzulelvJ5KVaT2pDlxivl_BN) [[Notebook]](t81_558_class_10_4_seasonal.ipynb)\n",
        "* Part 10.5: Predicting with Meta Prophet [[Video]](https://www.youtube.com/watch?v=MzjMVsz0GyA&list=PLjy4p-07OYzulelvJ5KVaT2pDlxivl_BN) [[Notebook]](t81_558_class_10_5_prophet.ipynb)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "wfX80RDjRQFH"
      },
      "source": [
        "# Google CoLab Instructions\n",
        "\n",
        "The following code checks that Google CoLab is and sets up the correct hardware settings for PyTorch.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 1,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "_YJNQP1WRQFH",
        "outputId": "817aa75f-41e8-41fe-ee05-2ca8c5ac07b5"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Note: using Google CoLab\n",
            "Using device: cuda\n"
          ]
        }
      ],
      "source": [
        "try:\n",
        "    import google.colab\n",
        "    COLAB = True\n",
        "    print(\"Note: using Google CoLab\")\n",
        "except:\n",
        "    print(\"Note: not using Google CoLab\")\n",
        "    COLAB = False\n",
        "\n",
        "# Make use of a GPU or MPS (Apple) if one is available.  (see module 3.2)\n",
        "import torch\n",
        "\n",
        "device = (\n",
        "    \"mps\"\n",
        "    if getattr(torch, \"has_mps\", False)\n",
        "    else \"cuda\"\n",
        "    if torch.cuda.is_available()\n",
        "    else \"cpu\"\n",
        ")\n",
        "print(f\"Using device: {device}\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "t_tys-e1RQFI"
      },
      "source": [
        "# Part 10.5: Transformers for TimeSeries in PyTorch\n",
        "\n",
        "The transformative landscape of deep learning has witnessed monumental strides in the recent past, particularly in the domain of Natural Language Processing (NLP). Central to this revolution has been the advent of transformer architectures, which, with their attention mechanisms, have pushed the boundaries of what's achievable in tasks like machine translation, sentiment analysis, and language modeling. However, while transformers initially rose to prominence primarily within the realm of NLP, their applicability isn't restricted to just textual data. A growing wave of interest has emerged around leveraging these models for time-series predictions—a challenge that, though numerically distinct, bears conceptual resemblance to understanding sequences in language.\n",
        "\n",
        "In time-series prediction, the objective often centers around forecasting future values based on historical data. This could involve predicting stock prices, weather patterns, or even the consumption of electricity in a region. At its core, this is a sequence-to-sequence task, where the past values form an input sequence and the future values to be predicted form an output sequence. Now, consider the similarities with machine translation in NLP, where an input sequence (sentence) in one language is translated into an output sequence in another language. Both scenarios require the model to recognize patterns, interdependencies, and context across sequences.\n",
        "\n",
        "This chapter delves deep into the nuances of using PyTorch transformers for time-series prediction. We will embark on this journey by first establishing a foundational understanding of how transformers operate within the NLP space, before segueing into their adaptation for numeric sequences. By juxtaposing these two applications, readers will gain a comprehensive appreciation of the transformer's versatility and the subtle considerations required when transitioning from text to time.\n",
        "\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Yllf6Ub-RQFJ"
      },
      "source": [
        "## Loading Sun Spot Data for a Transformer Time Series\n",
        "\n",
        "We will look at the same sunspot data as the previous section. However, this time we will use a transformer to predict. You can find the data files needed for this example at the following location.\n",
        "\n",
        "* [Sunspot Data Files](http://www.sidc.be/silso/datafiles#total)\n",
        "* [Download Daily Sunspots](http://www.sidc.be/silso/INFO/sndtotcsv.php) - 1/1/1818 to now.\n",
        "\n",
        "We use the following code to load the sunspot file:\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 2,
      "metadata": {
        "id": "RvQ6vUqbE6r6"
      },
      "outputs": [],
      "source": [
        "import numpy as np\n",
        "import pandas as pd\n",
        "import torch\n",
        "import torch.nn as nn\n",
        "from torch.utils.data import DataLoader, TensorDataset\n",
        "from sklearn.preprocessing import StandardScaler\n",
        "from torch.optim.lr_scheduler import ReduceLROnPlateau\n",
        "\n",
        "names = ['year', 'month', 'day', 'dec_year', 'sn_value',\n",
        "         'sn_error', 'obs_num', 'unused1']\n",
        "df = pd.read_csv(\n",
        "    \"https://data.heatonresearch.com/data/t81-558/SN_d_tot_V2.0.csv\",\n",
        "    sep=';', header=None, names=names,\n",
        "    na_values=['-1'], index_col=False)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "bE0PP-lpNUB-"
      },
      "source": [
        "The data preprocessing is the same as was introduced in the previous section. We will use data before the year 2000 as training, the rest is used for validation."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 3,
      "metadata": {
        "id": "us_X4qz3NVh9"
      },
      "outputs": [],
      "source": [
        "# Data Preprocessing\n",
        "start_id = max(df[df['obs_num'] == 0].index.tolist()) + 1\n",
        "df = df[start_id:].copy()\n",
        "df['sn_value'] = df['sn_value'].astype(float)\n",
        "df_train = df[df['year'] < 2000]\n",
        "df_test = df[df['year'] >= 2000]\n",
        "\n",
        "spots_train = df_train['sn_value'].to_numpy().reshape(-1, 1)\n",
        "spots_test = df_test['sn_value'].to_numpy().reshape(-1, 1)\n",
        "\n",
        "scaler = StandardScaler()\n",
        "spots_train = scaler.fit_transform(spots_train).flatten().tolist()\n",
        "spots_test = scaler.transform(spots_test).flatten().tolist()\n",
        "\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "GvZXHfkGTPX6"
      },
      "source": [
        "Just like we did for LSTM in the previous section, we again break the data into sequences."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 4,
      "metadata": {
        "id": "9utrJg62N1P-"
      },
      "outputs": [],
      "source": [
        "# Sequence Data Preparation\n",
        "SEQUENCE_SIZE = 10\n",
        "\n",
        "def to_sequences(seq_size, obs):\n",
        "    x = []\n",
        "    y = []\n",
        "    for i in range(len(obs) - seq_size):\n",
        "        window = obs[i:(i + seq_size)]\n",
        "        after_window = obs[i + seq_size]\n",
        "        x.append(window)\n",
        "        y.append(after_window)\n",
        "    return torch.tensor(x, dtype=torch.float32).view(-1, seq_size, 1), torch.tensor(y, dtype=torch.float32).view(-1, 1)\n",
        "\n",
        "x_train, y_train = to_sequences(SEQUENCE_SIZE, spots_train)\n",
        "x_test, y_test = to_sequences(SEQUENCE_SIZE, spots_test)\n",
        "\n",
        "# Setup data loaders for batch\n",
        "train_dataset = TensorDataset(x_train, y_train)\n",
        "train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)\n",
        "\n",
        "test_dataset = TensorDataset(x_test, y_test)\n",
        "test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)\n",
        "\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "o9TI-mjYTm9u"
      },
      "source": [
        "# Position Encoding for Transformers\n",
        "\n",
        "In the realm of the transformer architecture, a pivotal component that ensures the model's success is its ability to consider the sequence's order. Unlike traditional RNNs or LSTMs, which process sequences step-by-step and inherently respect their order, transformers process all tokens in a sequence simultaneously. While this parallel processing significantly boosts computational efficiency and allows for long-range dependencies to be captured more effectively, it also means that transformers, in their native form, are oblivious to the position or order of tokens in a sequence. This is where the concept of positional encoding comes into play.\n",
        "\n",
        "Positional encoding is a mechanism to provide the transformer with information about the position of tokens within a sequence. Essentially, it infuses order information into the otherwise position-agnostic embeddings. By adding positional encodings to the token embeddings before feeding them into the transformer, each token's position in the sequence becomes discernible to the model.\n",
        "\n",
        "Positional encodings are vectors that get added to the embeddings of tokens. The intuition is to design these vectors in such a way that their values or patterns are unique for each position, allowing the model to differentiate between different positions in the sequence.\n",
        "\n",
        "A popular method to generate positional encodings is using sinusoidal functions. For a given position $p$ in the sequence and dimension $d$ of the embedding, the positional encoding is computed as:\n",
        "\n",
        "$$ PE(2,i) = \\sin(\\frac{p}{10000^{2i/d}}) $$\n",
        "$$ PE(2,i+1) = \\cos(\\frac{p}{10000^{2i/d}}) $$\n",
        "\n",
        "Where $i$ is the dimension index. These sinusoidal functions generate values between -1 and 1 and ensure a unique and repeatable pattern for each position.\n",
        "\n",
        "The choice of sinusoidal functions isn't arbitrary. They have two compelling properties:\n",
        "\n",
        "1. They produce values between -1 and 1, making them compatible with most embedding value ranges.\n",
        "2. Their patterns allow the model to extrapolate positions beyond the sequence lengths seen during training.\n",
        "\n",
        "One might wonder, why not just append or add the token's position as an integer to the embedding? The challenge with this approach is scale. Embedding values, especially after being trained, can exist within a specific range, and directly adding large integers (for tokens further down in long sequences) might disrupt the information in the original embeddings.\n",
        "\n",
        "Furthermore, using raw integers wouldn't provide a consistent way for the model to generalize or extrapolate to sequence lengths not seen during training. Sinusoidal functions, in contrast, offer a predictable pattern that aids in such extrapolation.\n",
        "\n",
        "The following code describes a simple implementation of a transformer-based model using PyTorch's built-in functionalities. The **TransformerModel** class encapsulates a transformer-based neural network designed for sequence processing. Upon initialization, the model sets up several components: an encoder to adjust the input data to a desired dimension, a **pos_encoder** to bestow the sequence with positional information, a core **transformer_encoder** comprising several layers to process the sequence, and a **decoder** to produce the final output. As data flows through the model during the forward pass, it undergoes a series of transformations: it's first projected to a higher dimension, then augmented with positional encodings, processed by the transformer layers, and finally, the last token's representation is harnessed to produce the output. An instance of this model is readily created and can be assigned to a computation device for further training or inference.\n",
        "\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 5,
      "metadata": {
        "id": "wi-bRiyCN7Cw"
      },
      "outputs": [],
      "source": [
        "# Positional Encoding for Transformer\n",
        "class PositionalEncoding(nn.Module):\n",
        "    def __init__(self, d_model, dropout=0.1, max_len=5000):\n",
        "        super(PositionalEncoding, self).__init__()\n",
        "        self.dropout = nn.Dropout(p=dropout)\n",
        "\n",
        "        pe = torch.zeros(max_len, d_model)\n",
        "        position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)\n",
        "        div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-np.log(10000.0) / d_model))\n",
        "        pe[:, 0::2] = torch.sin(position * div_term)\n",
        "        pe[:, 1::2] = torch.cos(position * div_term)\n",
        "        pe = pe.unsqueeze(0).transpose(0, 1)\n",
        "        self.register_buffer('pe', pe)\n",
        "\n",
        "    def forward(self, x):\n",
        "        x = x + self.pe[:x.size(0), :]\n",
        "        return self.dropout(x)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "jJR-CJLeOzpd"
      },
      "source": [
        "# Constructing the Transformer Model\n",
        "\n",
        "The following code constructs the actual transformer-based model for time series prediction. The model is constructed to accept the following parameters.\n",
        "\n",
        "* **input_dim**: The dimension of the input data, in this case we use only one input, the number of sunspots.\n",
        "* **d_model**: The number of features in the transformer model's internal representations (also the size of embeddings). This controls how much a model can remember and process.\n",
        "* **nhead**: The number of attention heads in the multi-head self-attention mechanism.\n",
        "* **num_layers**: The number of transformer encoder layers.\n",
        "dropout: The dropout probability.\n",
        "\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 6,
      "metadata": {
        "id": "hoT-VFSdOANz"
      },
      "outputs": [],
      "source": [
        "# Model definition using Transformer\n",
        "class TransformerModel(nn.Module):\n",
        "    def __init__(self, input_dim=1, d_model=64, nhead=4, num_layers=2, dropout=0.2):\n",
        "        super(TransformerModel, self).__init__()\n",
        "\n",
        "        self.encoder = nn.Linear(input_dim, d_model)\n",
        "        self.pos_encoder = PositionalEncoding(d_model, dropout)\n",
        "        encoder_layers = nn.TransformerEncoderLayer(d_model, nhead)\n",
        "        self.transformer_encoder = nn.TransformerEncoder(encoder_layers, num_layers)\n",
        "        self.decoder = nn.Linear(d_model, 1)\n",
        "\n",
        "    def forward(self, x):\n",
        "        x = self.encoder(x)\n",
        "        x = self.pos_encoder(x)\n",
        "        x = self.transformer_encoder(x)\n",
        "        x = self.decoder(x[:, -1, :])\n",
        "        return x\n",
        "\n",
        "model = TransformerModel().to(device)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Mi_9wvWWfOGv"
      },
      "source": [
        "The Transformer architecture in PyTorch is governed by crucial configuration choices, among which **d_model**, **nhead**, and **num_layers** hold significant weight. The **d_model** denotes the dimensionality of the input embeddings and affects the model's capacity to learn intricate representations. While a more substantial **d_model** can bolster the richness of the model's understanding, it also amplifies the computational demand and can pose overfitting risks if not carefully chosen. Parallelly, the model's gradient flow and initialization are impacted by this choice, though the Transformer's normalization layers often moderate potential issues.\n",
        "\n",
        "On the other hand, **nhead** reflects the count of heads in the multi-head attention mechanism. A higher number of heads grants the model the prowess to simultaneously focus on diverse segments of the input, enabling the capture of varied contextual nuances. However, there's a trade-off. Beyond a specific threshold, the computational overhead might outweigh the marginal performance gains. This parallel processing, provided by multiple attention heads, tends to offer more stable and varied gradient information, positively influencing the training dynamics.\n",
        "\n",
        "Lastly, the **num_layers** parameter dictates the depth of the Transformer, determining the number of stacked encoder or decoder layers. A deeper model, as a result of increased layers, can discern more complex and hierarchical relationships in data. Still, there's a caveat: after a certain depth, potential performance enhancements may plateau, and the risk of overfitting might escalate. Training deeper models also comes with its set of challenges. Although residual connections and normalization in Transformers alleviate some concerns, a high layer count might necessitate techniques like gradient clipping or learning rate adjustments for stable training.\n",
        "\n",
        "In essence, these parameters intricately balance model capacity, computational efficiency, and generalization capability. Their optimal settings often emerge from task-specific experimentation, the nature of the data, and available computational prowess.\n",
        "\n",
        "## Training the Model\n",
        "\n",
        "\n",
        "Training a transformer-based model adheres to many of the familiar paradigms and best practices that apply to other neural network architectures. Much like the models we've encountered before, a transformer-based model benefits from training in batches, which helps in both computational efficiency and generalization. Batched training ensures that the model updates its weights based on the average gradient over several data points, rather than being excessively influenced by any single instance. Additionally, the use of early stopping acts as a safeguard against overfitting. By monitoring the model's performance on a validation set and halting training when no significant improvement is observed over a set number of epochs, we ensure that the model generalizes well and doesn't just memorize the training data. The validation set, it remains an essential component in the training regimen, providing a proxy measure of the model's performance on unseen data and guiding hyperparameter tuning. Thus, while transformer architectures introduce novel mechanisms and complexities, the foundational principles of training deep learning models in PyTorch remain consistent."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 7,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "g8-mF1OCOLnB",
        "outputId": "35d5c7d3-f1d4-49f3-ed9d-fc11bcf15349"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Epoch 1/1000, Validation Loss: 0.0421\n",
            "Epoch 2/1000, Validation Loss: 0.0512\n",
            "Epoch 3/1000, Validation Loss: 0.0429\n",
            "Epoch 4/1000, Validation Loss: 0.0376\n",
            "Epoch 5/1000, Validation Loss: 0.0386\n",
            "Epoch 6/1000, Validation Loss: 0.0403\n",
            "Epoch 7/1000, Validation Loss: 0.0358\n",
            "Epoch 8/1000, Validation Loss: 0.0395\n",
            "Epoch 9/1000, Validation Loss: 0.0385\n",
            "Epoch 10/1000, Validation Loss: 0.0368\n",
            "Epoch 11/1000, Validation Loss: 0.0353\n",
            "Epoch 12/1000, Validation Loss: 0.0376\n",
            "Epoch 13/1000, Validation Loss: 0.0390\n",
            "Epoch 14/1000, Validation Loss: 0.0344\n",
            "Epoch 15/1000, Validation Loss: 0.0362\n",
            "Epoch 16/1000, Validation Loss: 0.0408\n",
            "Epoch 17/1000, Validation Loss: 0.0343\n",
            "Epoch 18/1000, Validation Loss: 0.0396\n",
            "Epoch 19/1000, Validation Loss: 0.0357\n",
            "Epoch 20/1000, Validation Loss: 0.0355\n",
            "Epoch 00021: reducing learning rate of group 0 to 5.0000e-04.\n",
            "Epoch 21/1000, Validation Loss: 0.0434\n",
            "Early stopping!\n"
          ]
        }
      ],
      "source": [
        "# Train the model\n",
        "criterion = nn.MSELoss()\n",
        "optimizer = torch.optim.Adam(model.parameters(), lr=0.001)\n",
        "scheduler = ReduceLROnPlateau(optimizer, 'min', factor=0.5, patience=3, verbose=True)\n",
        "\n",
        "epochs = 1000\n",
        "early_stop_count = 0\n",
        "min_val_loss = float('inf')\n",
        "\n",
        "for epoch in range(epochs):\n",
        "    model.train()\n",
        "    for batch in train_loader:\n",
        "        x_batch, y_batch = batch\n",
        "        x_batch, y_batch = x_batch.to(device), y_batch.to(device)\n",
        "\n",
        "        optimizer.zero_grad()\n",
        "        outputs = model(x_batch)\n",
        "        loss = criterion(outputs, y_batch)\n",
        "        loss.backward()\n",
        "        optimizer.step()\n",
        "\n",
        "    # Validation\n",
        "    model.eval()\n",
        "    val_losses = []\n",
        "    with torch.no_grad():\n",
        "        for batch in test_loader:\n",
        "            x_batch, y_batch = batch\n",
        "            x_batch, y_batch = x_batch.to(device), y_batch.to(device)\n",
        "            outputs = model(x_batch)\n",
        "            loss = criterion(outputs, y_batch)\n",
        "            val_losses.append(loss.item())\n",
        "\n",
        "    val_loss = np.mean(val_losses)\n",
        "    scheduler.step(val_loss)\n",
        "\n",
        "    if val_loss < min_val_loss:\n",
        "        min_val_loss = val_loss\n",
        "        early_stop_count = 0\n",
        "    else:\n",
        "        early_stop_count += 1\n",
        "\n",
        "    if early_stop_count >= 5:\n",
        "        print(\"Early stopping!\")\n",
        "        break\n",
        "    print(f\"Epoch {epoch + 1}/{epochs}, Validation Loss: {val_loss:.4f}\")\n",
        "\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "HS-lkHvOhXGm"
      },
      "source": [
        "We can now evaluate the performance of this model."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 8,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "dE9fco0GOQr6",
        "outputId": "1591cc01-5e80-4250-db31-1b93317391fc"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Score (RMSE): 15.0481\n"
          ]
        }
      ],
      "source": [
        "# Evaluation\n",
        "model.eval()\n",
        "predictions = []\n",
        "with torch.no_grad():\n",
        "    for batch in test_loader:\n",
        "        x_batch, y_batch = batch\n",
        "        x_batch = x_batch.to(device)\n",
        "        outputs = model(x_batch)\n",
        "        predictions.extend(outputs.squeeze().tolist())\n",
        "\n",
        "rmse = np.sqrt(np.mean((scaler.inverse_transform(np.array(predictions).reshape(-1, 1)) - scaler.inverse_transform(y_test.numpy().reshape(-1, 1)))**2))\n",
        "print(f\"Score (RMSE): {rmse:.4f}\")"
      ]
    }
  ],
  "metadata": {
    "accelerator": "GPU",
    "anaconda-cloud": {},
    "colab": {
      "gpuType": "T4",
      "provenance": []
    },
    "kernelspec": {
      "display_name": "Python 3.9 (torch)",
      "language": "python",
      "name": "pytorch"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.9.18"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
